21 research outputs found

Predicting protein function with hierarchical phylogenetic profiles: The Gene3D phylo-tuner method applied to eukaryotic Genomes

Author: Grant A
Orengo CA
Ranea JAG
Yeats C
Publication venue: Public Library of Science
Publication date: 30/11/2007

"Phylogenetic profiling" is based on the hypothesis that, during evolution, functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence-absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step, and it largely explains why previous work in this area has mainly focused on using presence-absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence-absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses the genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity, from 30% to 100%, and phylogenetic profiles are built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will "auto-tune" with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by conventional presence-absence profile comparisons, and it does not require any fixed a priori criteria to define orthologous genes.

UCL Discovery
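
The comparison step described in the Gene3D phylo-tuner abstract above can be illustrated with a minimal sketch. It assumes that each family is represented by one copy-number profile per sequence-identity level (one count per genome) and that "normalised" means scaling each profile to unit length before taking the Euclidean distance. The abstract does not specify the normalisation, so that choice and the function names `normalise`, `profile_distance`, and `best_level_pair` are hypothetical, not the authors' implementation.

```python
import math


def normalise(profile):
    """Scale a copy-number profile to unit length (assumed normalisation)."""
    norm = math.sqrt(sum(x * x for x in profile))
    return [x / norm for x in profile] if norm else list(profile)


def profile_distance(a, b):
    """Euclidean distance between two unit-length copy-number profiles."""
    na, nb = normalise(a), normalise(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)))


def best_level_pair(profiles_a, profiles_b):
    """Find the pair of identity levels at which two families agree best.

    profiles_a, profiles_b: dicts mapping an identity level (e.g. 30, 60)
    to a list of domain copy numbers, one entry per genome.
    Returns (distance, level_a, level_b) for the closest pair of profiles.
    """
    return min(
        (profile_distance(pa, pb), la, lb)
        for la, pa in profiles_a.items()
        for lb, pb in profiles_b.items()
    )


# Toy data: two families profiled at two identity levels over four genomes.
family_a = {30: [2, 4, 6, 0], 60: [1, 0, 3, 0]}
family_b = {30: [5, 5, 5, 5], 60: [1, 2, 3, 0]}
distance, level_a, level_b = best_level_pair(family_a, family_b)
```

In this toy example the 30% profile of the first family is proportional to the 60% profile of the second, so that pair of levels gives the smallest distance. Searching over all level pairs is one simple way to mimic the "auto-tuning" idea without fixing an orthology threshold in advance.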

Comparative evolutionary analysis of protein complexes in E. coli and yeast

Author: Orengo CA
Ranea JAG
Reid AJ
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Background: Proteins do not act in isolation; they frequently act together in protein complexes to carry out concerted cellular functions. The evolution of complexes is poorly understood, especially in organisms other than yeast, where little experimental data has been available. Results: We generated accurate, high coverage datasets of protein complexes for E. coli and yeast in order to study differences in the evolution of complexes between these two species. We show that substantial differences exist in how complexes have evolved between these organisms. A previously proposed model of complex evolution identified complexes with cores of interacting homologues. We support findings of the relative importance of this mode of evolution in yeast, but find that it is much less common in E. coli. Additionally, it is shown that those homologues which do cluster in complexes are involved in eukaryote-specific functions. Furthermore, we identify correlated pairs of non-homologous domains which occur in multiple protein complexes. These were identified in both yeast and E. coli, and we present evidence that these too may represent complex cores in yeast but not those of E. coli. Conclusions: Our results suggest that there are differences in the way protein complexes have evolved in E. coli and yeast. Whereas some yeast complexes have evolved by recruiting paralogues, this is not apparent in E. coli. Furthermore, such complexes are involved in eukaryotic-specific functions. This implies that the increase in gene family sizes seen in eukaryotes in part reflects multiple family members being used within complexes. However, in general, in both E. coli and yeast, homologous domains are used in different complexes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

UCL Discovery

PubMed Central

Finding the "Dark Matter" in Human and Yeast Protein Network Prediction and Modelling

Author: Clegg AB
Lees JG
Morilla I
Orengo C
Ranea JAG
Reid AJ
Sanchez-Jimenez F
Yeats C
Publication venue: Public Library of Science
Publication date: 23/09/2010

Accurate modelling of biological systems requires a deeper and more complete knowledge about the molecular components and their functional associations than we currently have. Traditionally, new knowledge on protein associations generated by experiments has played a central role in systems modelling, in contrast to generally less trusted bio-computational predictions. However, we will not achieve realistic modelling of complex molecular systems if the current experimental designs lead to biased screenings of real protein networks and leave large, functionally important areas poorly characterised. To assess the likelihood of this, we have built comprehensive network models of the yeast and human proteomes by using a meta-statistical integration of diverse computationally predicted protein association datasets. We have compared these predicted networks against combined experimental datasets from seven biological resources at different levels of statistical significance. These eukaryotic predicted networks resemble all the topological and noise features of the experimentally inferred networks in both species, and we also show that this observation is not due to random behaviour. In addition, the topology of the predicted networks contains information on true protein associations, beyond the constitutive first order binary predictions. We also observe that most of the reliable predicted protein associations are experimentally uncharacterised in our models, constituting the hidden or "dark matter" of networks by analogy to astronomical systems. Some of this dark matter shows enrichment of particular functions and contains key functional elements of protein networks, such as hubs associated with important functional areas like the regulation of Ras protein signal transduction in human cells. Thus, characterising this large and functionally important dark matter, elusive to established experimental designs, may be crucial for modelling biological systems. In any case, these predictions provide a valuable guide to these experimentally elusive regions.

UCL Discovery

Bacterial Genomes: Habitat Specificity and Uncharted Organisms

Author: A Bernal
C Pedrós-Alió
D Wu
EA Dinsdale
FE Angly
Fernando Dini Andreote
Francisco Dini-Andreote
GR Burke
H Toh
J Raes
JA Gilbert
Jack T. Trevors
JAG Ranea
Jan Dirk van Elsas
JE Barrick
JK Harris
JT Trevors
L Oksana
L Philippot
M Touchon
M Wagner
ML Sogin
NR Pace
P Lapierre
P Yilmaz
PKH Lee
RT Jones
S Abby
SG Tringe
T Ishoey
T Woyke
TM Vogel
Welington Luiz Araújo
Publication venue: Springer-Verlag
Publication date: 01/01/2012

The capability and speed in generating genomic data have increased profoundly since the release of the draft human genome in 2000. Additionally, sequencing costs have continued to plummet as the next generation of highly efficient sequencing technologies (next-generation sequencing) became available and commercial facilities promote market competition. However, new challenges have emerged as researchers attempt to efficiently process the massive amounts of sequence data being generated. First, the described genome sequences are unequally distributed among the branches of bacterial life and, second, bacterial pan-genomes are often not considered when setting aims for sequencing projects. Here, we propose that scientists should be concerned with attaining an improved equal representation of most of the bacterial tree of life organisms, at the genomic level. Moreover, they should take into account the natural variation that is often observed within bacterial species and the role of the often changing surrounding environment and natural selection pressures, which is central to bacterial speciation and genome evolution. Not only will such efforts contribute to our overall understanding of the microbial diversity extant in ecosystems as well as the structuring of the extant genomes, but they will also facilitate the development of better methods for (meta)genome annotation

Crossref

Proceedings - University of Groningen

University of Groningen

Springer - Publisher Connector

ARTS repository - University of Groningen

PubMed Central

Dissertations of the University of Groningen

ProPhylo: partial phylogenetic profiling to guide protein family construction and assignment of biological process

Author: CJ Stubben
D Barker
D Barker
D Haft
D Szklarczyk
DA Rodionov
Daniel H Haft
DH Haft
DH Haft
DH Haft
EM Marcotte
F Eckstein
F Enault
GV Glazko
H-Y Ou
J Sun
J Wu
J-P Vert
JAG Ranea
JD Selengut
JD Selengut
JD Selengut
Jeremy D Selengut
L Ferrer
M Csurös
M Huynen
M Pellegrini
MA Huynen
Malay K Basu
MS Gelfand
P Pagel
PM Bowers
PR Kensche
PS Dehal
R Jothi
RL Tatusov
S Briesemeister
S Freilich
SR Eddy
SV Date
SV Date
T Blum
T Gaasterland
T Xu
T Yamada
X Brazzolotto
Y Hong
Y Liu
Y Zhou
Z Jiang
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

Ancient horizontal gene transfer and the last common ancestors

Author: A Barzel
A Burt
A Hua-Van
A Sauerwald
A Stoltzfus
AM Barbaglia
B Boussau
B Reisinger
C Darwin
Cheryl P Andam
CP Andam
CP Andam
CR Woese
CR Woese
D Darriba
D Williams
DA Benson
DH Rothman
EGJ Danchin
EV Koonin
EV Koonin
EV Koonin
G Borrel
G Borrel
G Eriani
G Srinivasan
GJ Szöllosi
GM Nagel
GP Fournier
GP Fournier
GP Fournier
Gregory P Fournier
H Grosjean
J Peretó
J Thomas
JA Krzycki
JA Krzycki
JAG Ranea
JL Siefert
JM Kavran
Johann Peter Gogarten
JP Gogarten
K Swithers
K Vetsigian
L Olendzenski
L Ribas de Pouplana
L Ribas de Pouplana
L Sinzelle
LS Frost
M Ibba
M Khomyakova
M Syvanen
M Wu
MH Mazauric
MV Omelchenko
N Lartillot
N Nameki
O Penn
O Zhaxybayeva
P Kück
P Kück
P O’Donoghue
P O’Donoghue
P Schimmel
R Dawkins
R Jain
RC Edgar
S Bilokapic
S Gould
S Guindon
S Herring
S Herring
S Morris
S Osawa
SQ Le
T Tuller
TJ Treangen
VV Kapitonov
WM Fitch
Y Diaz-Lazcoz
Y Zhang
YI Wolf
Z-P Fang
Publication venue: Springer Science and Business Media LLC
Publication date: 22/04/2015

Background: The genomic history of prokaryotic organismal lineages is marked by extensive horizontal gene transfer (HGT) between groups of organisms at all taxonomic levels. These HGT events have played an essential role in the origin and distribution of biological innovations. Analyses of ancient gene families show that HGT existed in the distant past, even at the time of the organismal last universal common ancestor (LUCA). Most gene transfers originated in lineages that have since gone extinct. Therefore, one cannot assume that the last common ancestors of each gene were all present in the same cell representing the cellular ancestor of all extant life. Results: Organisms existing as part of a diverse ecosystem at the time of LUCA likely shared genetic material between lineages. If these other lineages persisted for some time, HGT with the descendants of LUCA could have continued into the bacterial and archaeal lineages. Phylogenetic analyses of aminoacyl-tRNA synthetase protein families support the hypothesis that the molecular common ancestors of the most ancient gene families did not all coincide in space and time. This is most apparent in the evolutionary histories of seryl-tRNA synthetase and threonyl-tRNA synthetase protein families, each containing highly divergent "rare" forms, as well as the sparse phylogenetic distributions of pyrrolysyl-tRNA synthetase, and the bacterial heterodimeric form of glycyl-tRNA synthetase. These topologies and phyletic distributions are consistent with horizontal transfers from ancient, likely extinct branches of the tree of life. Conclusions: Of all the organisms that may have existed at the time of LUCA, by definition only one lineage is survived by known progeny; however, this lineage retains a genomic record of heterogeneous genetic origins. The evolutionary histories of aminoacyl-tRNA synthetases (aaRS) are especially informative in detecting this signal, as they perform primordial biological functions, have undergone several ancient HGT events, and contain many sites with low substitution rates allowing deep phylogenetic reconstruction. We conclude that some aaRS families contain groups that diverge before LUCA. We propose that these ancient gene variants be described by the term "hypnologs", reflecting their ancient, reticulate origin from a time in life history that has been all but erased.
Funding: National Science Foundation (U.S.) (Grant DEB 0830024); Exobiology Program (U.S.) (Grant NNX10AR85G); United States. National Aeronautics and Space Administration (Postdoctoral Program

DSpace@MIT

Crossref

Springer - Publisher Connector

PubMed Central

The GAAS Metagenomic Tool and Its Estimations of Viral and Microbial Average Genome Size in Four Major Biomes

Author: AC Paoletti
Alejandra Prieto-Davó
B Diez
B Zybailov
Baoli Zhu
Beltran Rodriguez-Mueller
C Desnues
Christelle Desnues
D Rasko
D Willner
Dana Willner
David L. Kirchman
DH Huson
Dionysios A. Antonopoulos
DL Wheeler
EA Dinsdale
EA Dinsdale
Egbert Mundt
Elizabeth A. Dinsdale
F Angly
F Meyer
F Rohwer
FE Angly
Florent E. Angly
FM Lauro
Folker Meyer
Forest Rohwer
Gary D. Stormo
GF Steward
I Hewson
I Letunic
J Raes
J Raes
JAG Ranea
John D. McPherson
K Holmfeldt
K Rosario
Katie Barott
KE Wommack
KE Wommack
KT Konstantinidis
L Florens
LB Koski
Linda Wegley
Lixin Zhang
LM Graves
M Dyall-Smith
M Pignatelli
Matthew Haynes
Matthew R. Henn
Matthew T. Cottrell
MG Weinbauer
Mike Furlan
P DasSarma
P Hugenholtz
R Sadreyev
R Sandaa
R Sandaa
R Seshadri
R. Michael Miller
Rebecca Vega-Thurber
Rick Stevens
RL Vega Thurber
Robert A. Edwards
Robert K. Naviaux
Robert Schmieder
RV Thurber
S Karlin
SD Bentley
SF Altschul
Tracey McDole
Yongfei Hu
Publication venue: Public Library of Science
Publication date: 01/01/2009

Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions
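
The two corrections named in the GAAS abstract, similarity weighting and length normalization, can be sketched roughly as follows. This is an illustrative toy model only: it assumes each read carries a set of candidate genomes with non-negative similarity weights, that a read's weights are rescaled to sum to one, and that per-genome totals are divided by genome length before being turned into relative abundances. The function names and data layout are hypothetical and are not taken from the GAAS software.

```python
from collections import defaultdict


def relative_abundance(read_hits, genome_length):
    """Estimate relative genome abundance from weighted read hits.

    read_hits: list with one dict per read, mapping genome name to a
        non-negative similarity weight (assumed input format).
    genome_length: dict mapping genome name to its length in bp.
    """
    raw = defaultdict(float)
    for hits in read_hits:
        total = sum(hits.values())
        if total <= 0:
            continue  # read has no usable similarity
        # Similarity weighting: each read contributes one unit in total.
        for genome, weight in hits.items():
            raw[genome] += weight / total
    # Length normalization: longer genomes attract more reads per copy.
    per_bp = {g: w / genome_length[g] for g, w in raw.items()}
    scale = sum(per_bp.values())
    if scale == 0:
        return {}
    return {g: v / scale for g, v in per_bp.items()}


def average_genome_length(abundance, genome_length):
    """Abundance-weighted mean genome length of the community."""
    return sum(abundance[g] * genome_length[g] for g in abundance)


# Toy data: a 1 kbp genome with one read and a 4 kbp genome with four reads.
lengths = {"small": 1000, "large": 4000}
reads = [{"small": 1.0}] + [{"large": 1.0}] * 4
abundance = relative_abundance(reads, lengths)
mean_length = average_genome_length(abundance, lengths)
```

In the toy data the larger genome receives four times as many reads but is also four times as long, so after length normalization both genomes come out equally abundant. Counting reads alone would instead favour the larger genome, which is the kind of bias against small genomes that the abstract describes.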

HAL AMU

Directory of Open Access Journals

HAL Descartes

DigitalCommons@Florida International University

Hal-Diderot

University of Queensland eSpace

Public Library of Science (PLOS)

eScholarship - University of California

ScholarlyCommons@Penn

Combining modularity, conservation, and interactions of proteins significantly increases precision and coverage of protein function prediction

Author: A Guarne
A Sturz
A Vazquez
AL Barabasi
B Boeckmann
B Schwikowski
BE Engelhardt
BJ Dickson
BP Kelley
C von Mering
Christine T Sers
Consortium FlyBase
D Eisenberg
D Frishman
D Lee
D Pal
E Birgbauer
EB Pasquale
F Chen
FM Couto
GD Bader
GD Bader
GM Li
GT Hart
H Hishigaki
HN Chua
HN Chua
J Dutkowski
J Huot
J Jiricny
J Jiricny
J Song
JAG Ranea
JE Stone
JP Himanen
K Dolinski
K Forslund
K Yoshioka
L Giot
L Li
LH Hartwell
LR Matthews
M Ashburner
M Chitale
M Deng
M Deng
M Koyutürk
M Punta
MA Huynen
MC Hall
ME Futschik
MT Dittrich
N Erdeniz
N Ikegaki
NJ Mulder
P Hsieh
P Pagel
R Brambilla
R Llewellyn
R Saeed
R Sharan
R Sharan
Reference Genome Group of the Gene Ontology Consortium
S Jaeger
S Peri
S Sun
Samira Jaeger
SF Altschul
SL Gibson
SM Baker
SM Baxter
SR Kumar
T Hawkins
T Yamada
Ulf Leser
V Spirin
X Zhou
XW Chen
Y Habraken
Y Loewenstein
Y Tao
Publication venue: BioMed Central
Publication date: 01/12/2010

Background: While the number of newly sequenced genomes and genes is constantly increasing, elucidation of their function is still a laborious and time-consuming task. This has led to the development of a wide range of methods for predicting protein functions in silico. We report on a new method that predicts function based on a combination of information about protein interactions, orthology, and the conservation of protein networks in different species. Results: We show that aggregation of these independent sources of evidence leads to a drastic increase in number and quality of predictions when compared to baselines and other methods reported in the literature. For instance, our method generates more than 12,000 novel protein functions for human with an estimated precision of ~76%, among which are 7,500 new functional annotations for 1,973 human proteins that previously had zero or only one function annotated. We also verified our predictions on a set of genes that play an important role in colorectal cancer (MLH1, PMS2, EPHB4) and could confirm more than 73% of them based on evidence in the literature. Conclusions: The combination of different methods into a single, comprehensive prediction method infers thousands of protein functions for every species included in the analysis at varying, yet always high levels of precision and very good coverage.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evolution of protein domain architectures

Author: A Heger
A Marchler-Bauer
A Nagy
A Nagy
A Nagy
A Nasir
A Rijk van
A Rzhetsky
A-L Barabási
AD Moore
AD Moore
AD Moore
AH Brivanlou
AR Kersting
B Lee
B Snel
C Bru
C Chothia
C Feschotte
C Haider
C Vogel
C Vogel
C-H Hsu
C-H Hsu
CM Zmasek
D Ekman
D Wilson
DP Syamaladevi
E Bornberg-Bauer
E Dohmen
E Gogvadze
E Nimwegen van
EE Schmidt
EM Marcotte
EV Koonin
G Apic
G Apic
GP Karev
H Tordai
I Cohen-Gihon
I Letunic
I Yanai
J Gough
J Qian
J Weiner
J Weiner
J Weiner III
J Wiedenhoeft
J-M Chandonia
JAG Ranea
JH Fong
JM Eirin-Lopez
JP Demuth
JS Farris
K Forslund
L Grassi
L Leclère
L Li
L Patthy
LY Geer
M Bashton
M Buljan
M Buljan
M d C Orozco-Mosqueda
M Itoh
M Liu
M Sharma
M Stolzer
M Toll-Riera
MA Huynen
MK Basu
MK Basu
N Terrapon
N Vera-Parra
NC Brissett
NL Dawson
NM Luscombe
R Cordaux
RD Finn
RD Finn
RF Doolittle
S Wuchty
S Yang
SD Lam
SK Kummerfeld
SK Kummerfeld
T Bitard-Feildel
T Doğan
T Koestler
T Przytycka
TE Lewis
UniProt Consortium
V Hollich
VA Kuznetsov
W-D Heyer
X Xie
X-C Zhang
Y-C Wu
ÅK Björklund
ÅK Björklund
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

This chapter reviews current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this will directly impact which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and larger evolutionary groups. These findings were subsequently extended to multi-domain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly). We end with a discussion of some available tools for computational analysis or exploitation of protein domain architectures and their evolution.

Crossref

MDC Repository

Letting go: bacterial genome reduction solves the dilemma of adapting to predation mortality in a substrate-restricted environment

Author: A Dufresne
A Ross-Gillespie
A San Millan
C Cressler
C Ruiz-González
D Bieber
DC Whitehead
F Dini-Andreote
FM Lauro
FO Aylward
G Corno
J Pernthaler
JAG Ranea
Jakob Pernthaler
JF Blom
JF Blom
JG Mitchell
JR Chandler
K Rutherford
K Tang
LGM Baas-Becking
M Baumgartner
Michael Baumgartner
MW Hahn
MW Hahn
PD Scanlan
R Aziz
R Gil
R Overbeek
R-M Delrue
RE Lenski
RS Stephens
S Koskiniemi
S Maisnier-Patin
SF Altschul
SGE Andersson
SJ Giovannoni
SJ Giovannoni
SJ Guildford
SL Garcia
Stefan Roffler
T Kiørboe
T Zotina
TF Thingstad
Thomas Wicker
TR Miller
TW Ghylin
V Kasalicky
Y Raynes
Publication venue: Springer Science and Business Media LLC
Publication date: 06/06/2017

Resource limitation and predation mortality are major determinants of microbial population dynamics, and optimization for either aspect is considered to imply a trade-off with respect to the other. Adaptation to these selective factors may, moreover, lead to disadvantages under rich growth conditions. We present an example of a concomitant evolutionary optimization to both substrate limitation and predation in an aggregate-forming freshwater bacterial isolate, and we elucidate an underlying genomic mechanism. Bacteria were propagated in serial batch culture in a nutrient-restricted environment either with or without a bacterivorous flagellate. Strains isolated after 26 growth cycles of the predator–prey co-cultures formed as much total biomass as the ancestor under ancestral growth conditions, albeit largely reallocated to cell aggregates. A ~273 kbp genome fragment was lost in three strains that had independently evolved with predators. These strains had significantly higher growth yield on substrate-restricted media than others that were isolated from the same treatment before the excision event. Under predation pressure, the isolates with the deletion outcompeted both the ancestor and the strains evolved without predators, even under rich growth conditions. At the same time, genome reduction led to a growth disadvantage in the presence of benzoate due to the loss of the respective degradation pathway, suggesting that niche constriction might be the price for the bidirectional optimization.

Crossref

ZORA